Sequence Graph Transform (SGT)

Authors

  • Chitta Ranjan
  • Samaneh Ebrahimi
  • Kamran Paynabar

Abstract

The ubiquity of sequence data across fields such as the web, healthcare, bioinformatics, and text mining has made sequence mining a vital research area. However, sequence mining is particularly challenging because there is no accurate and fast approach for measuring (dis)similarity between sequences. Mainstream data mining methods such as k-means, kNN, and regression have shown distance between data points in a Euclidean space to be the most effective measure of (dis)similarity. But a distance measure between sequences is not obvious because sequences are unstructured: they are arbitrary strings of arbitrary length. We therefore propose a new function, called Sequence Graph Transform (SGT), that extracts sequence features and embeds them in a finite-dimensional Euclidean space. It is scalable because of its low computational complexity, and it is applicable to any sequence problem. We show theoretically that SGT can capture both short and long patterns in sequences and provides an accurate distance-based measure of (dis)similarity between them. This is also validated experimentally. Finally, we show its real-world application to clustering, classification, search, and visualization on different sequence problems.
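
The general idea of embedding variable-length sequences in a fixed-dimensional Euclidean space can be illustrated with a minimal sketch. This is not the SGT formulation from the paper, which the abstract does not specify. The function names `pair_decay_embedding` and `euclidean`, the decay parameter `kappa`, and the exponential gap weighting are all illustrative assumptions.

```python
import math
from itertools import product


def pair_decay_embedding(seq, alphabet, kappa=1.0):
    """Map a sequence to a vector of length len(alphabet) ** 2.

    Hypothetical illustration, not the paper's definition: for each ordered
    symbol pair (u, v), average exp(-kappa * gap) over every occurrence of u
    followed later by v, where gap is the distance between their positions.
    Every symbol in seq must appear in alphabet.
    """
    pairs = list(product(alphabet, repeat=2))
    sums = {pair: 0.0 for pair in pairs}
    counts = {pair: 0 for pair in pairs}
    for i, u in enumerate(seq):
        for j in range(i + 1, len(seq)):
            pair = (u, seq[j])
            sums[pair] += math.exp(-kappa * (j - i))
            counts[pair] += 1
    # Pairs that never occur contribute 0.0 to the embedding.
    return [sums[p] / counts[p] if counts[p] else 0.0 for p in pairs]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


alphabet = "AB"
x = pair_decay_embedding("ABAB", alphabet)      # short alternating pattern
y = pair_decay_embedding("ABABABAB", alphabet)  # same pattern, longer
z = pair_decay_embedding("BBAA", alphabet)      # different pattern

print(len(x), len(y), len(z))            # all three vectors share one dimension
print(euclidean(x, y), euclidean(x, z))  # compare the two distances
```

Because every sequence maps to a vector of the same length regardless of its own length, standard distance-based methods such as k-means or kNN can be applied directly to the embedded vectors. In this toy example the two alternating sequences end up closer to each other than either is to `BBAA`.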

Similar resources

Recruitment of the ventral and dorsal streams in statistical graph comprehension: An fMRI study.

BACKGROUND: Although many previous studies have focused on statistical graph comprehension in cognitive psychology, there is no consensus among them. OBJECTIVE: Brain neuroimaging studies on statistical graph comprehension are useful to account for the cognitive mechanism of interpreting statistical graphic information. METHODS: The present study used two experimental conditions, a statist...

Incorporating Latent Semantic Indexing into Spectral Graph Transducer for Text Classification

Spectral Graph Transducer (SGT) is one of the superior graph-based transductive learning methods for classification. For the Spectral Graph Transducer algorithm, a good graph representation of the data to be processed is very important. In this paper, we try to incorporate Latent Semantic Indexing (LSI) into SGT for text classification. First, we exploit LSI to represent documents as vectors in...

Performance Analysis of Concurrency Control Using Locking with Conditional Blocking

There is growing evidence that, for a wide variety of database workloads and system configurations, two-phase locking (2PL) outperforms other types of concurrency control schemes. However, in the presence of long-lived transactions (LLTs), 2PL suffers from long suspension delays because LLTs are allowed to hold locks on data until they commit. To alleviate this problem we propose an ext...

Decreased expression of Small glutamine-rich tetratricopeptide repeat-containing protein (SGT) correlated with prognosis of Hepatocellular carcinoma.

Small glutamine-rich tetratricopeptide repeat-containing protein (SGT) is a ubiquitously expressed cochaperone of heat shock cognate protein of 70 kDa (Hsc70). SGT binds to the C terminus of Hsc70 to recruit Hsc70 into complexes of diverse function. SGTB was identified as an isoform of SGT with 60% amino acid sequence homology. To investigate the expression of SGTB in hepatocellular carcinoma ...

A Framework for Stochastic System Modelling and Analysis: Work in Progress

Stochastic Graph Transformation combines the benefits of graphical modelling with stochastic analysis techniques. In this paper we report on our framework Sma for Stochastic Modelling and Analysis, and SGT, a tool which uses the framework for Stochastic Graph Transformation.

Publication date: 2016